Speaker Identification and Spoken word Recognition in Noisy Environment using Different Techniques
Abstract
This work designs automatic speech recognition (ASR) systems in software that perform speaker identification, spoken word recognition, and the combination of both in a general noisy environment. The systems are designed for a limited vocabulary of Telugu words and control commands. Experiments are conducted to find the combination of feature extraction technique and classifier model that performs best in a general noisy environment (a home or office environment where noise is around 15-35 dB). Gammatone frequency coefficients, a recently proposed feature extraction technique reported as the best fit to the human auditory system, are chosen for the experiments along with the more common feature extraction techniques MFCC and PLP as the front-end process (i.e., speech feature extraction). Two artificial neural network classifiers, Learning Vector Quantization (LVQ) neural networks and Radial Basis Function (RBF) neural networks, along with Hidden Markov Models (HMMs), are chosen as the back-end process (i.e., training/modeling the ASR systems). The performance of the ASR systems built from the 9 combinations (3 feature extraction techniques and 3 classifier models) is analyzed in terms of spoken word recognition and speaker identification accuracy (success rate), design time, and recognition/identification response time. The test speech samples are recorded in general noisy conditions, i.e., in the presence of air conditioning noise, fan noise, computer keyboard noise, and distant cross-talk noise. The ASR systems are designed and analyzed programmatically in the MATLAB R2013a environment.
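The 3 x 3 comparison described in the abstract can be pictured as a small evaluation grid. The sketch below is only an illustration in Python, not the authors' MATLAB code. The names `evaluate`, `run_grid`, and `toy_system` are hypothetical, and `toy_system` is a placeholder that memorizes labels instead of implementing real MFCC/PLP/Gammatone front ends or LVQ/RBF/HMM back ends. It shows how accuracy, design (training) time, and response time could be collected for each of the 9 combinations.

```python
import itertools
import time

# Labels taken from the abstract; the implementations behind them are NOT shown here.
FEATURES = ["MFCC", "PLP", "Gammatone"]
CLASSIFIERS = ["LVQ", "RBF", "HMM"]


def evaluate(train_fn, predict_fn, train_set, test_set):
    """Return (accuracy, design_time_s, mean_response_time_s) for one system."""
    start = time.perf_counter()
    model = train_fn(train_set)              # "design time" = time spent training
    design_time = time.perf_counter() - start

    correct = 0
    start = time.perf_counter()
    for sample, label in test_set:
        if predict_fn(model, sample) == label:
            correct += 1
    mean_response = (time.perf_counter() - start) / len(test_set)
    return correct / len(test_set), design_time, mean_response


def run_grid(make_system, train_set, test_set):
    """Evaluate every (feature, classifier) pair; make_system is caller-supplied."""
    results = {}
    for feat, clf in itertools.product(FEATURES, CLASSIFIERS):
        train_fn, predict_fn = make_system(feat, clf)
        results[(feat, clf)] = evaluate(train_fn, predict_fn, train_set, test_set)
    return results


def toy_system(feat, clf):
    """Hypothetical stand-in: memorizes training labels, ignores feat and clf."""
    def train_fn(train_set):
        return dict(train_set)

    def predict_fn(model, sample):
        return model.get(sample)

    return train_fn, predict_fn


if __name__ == "__main__":
    data = [("utterance_a", 0), ("utterance_b", 1)]   # made-up samples
    for combo, (acc, design, response) in run_grid(toy_system, data, data).items():
        print(combo, f"accuracy={acc:.2f}", f"design={design:.6f}s",
              f"response={response:.6f}s")
```

Replacing `toy_system` with a factory that returns real feature-extraction and classifier code would reuse the same loop unchanged, which keeps the three reported metrics comparable across all combinations.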
Related resources
Robust Text-independent Speaker Identification in a Time-varying Noisy Environment
Practical speaker recognition systems are often subject to noise or distortions within the input speech, which degrade performance. In this paper, we propose a new mel-frequency cepstral coefficients (MFCC) based speaker identification system with a Vector Quantization (VQ) modeling technique. It integrates a hearing masking effect based masker and a group of dozen triflers into traditional MFCC...
Text Dependent Speaker Identification System using Discrete HMM in Noise
In this paper, an improved strategy for an automated text-dependent speaker identification system in a noisy environment is proposed. The identification process combines the Hidden Markov Model technique with cepstral-based features. To remove background noise from the source utterance, a Wiener filter is used. Different speech pre-processing techniques such as start-end point dete...
Speaker Accent and Isolated Kannada Word Recognition
An algorithm is designed for isolated Kannada word recognition across the accents of Kannada speakers from five districts. Isolated Kannada word recognition is designed using syllables, the Baum-Welch algorithm, and the Normal fit method. The novelty of the proposed method is that it recognizes the accents of Kannada speakers from five districts as well as the spoken words. Our model is compared with baseline Hidden Markov Model (HMM) and ...
Closed-Set Speaker Identification Based on a Single Word Utterance: An Evaluation of Alternative Approaches
The problem of closed-set speaker identification based on a single spoken word from a limited vocabulary is relevant to several current and futuristic interactive multimedia applications. In this paper, we evaluate the effectiveness of several potential solutions using an isolated word speech corpus. In addition to evaluating the text-dependent and text-constrained variants of the Gaussian Mixt...
Speaker Identification in Noisy Environment with Use of the Precise Model of the Human Auditory System
This paper discusses an approach for speaker identification in noisy environment using the multi-dimensional pulse signals generated from the model of a human peripheral auditory system. The peripheral auditory model employed here consists of a basilar membrane, hair cells, and auditory nerves. The input to this model is a speech signal divided into frames, and the outputs of which are the mult...